OFFER: Off-Environment Reinforcement Learning

Authors

  • Kamil Andrzej Ciosek
  • Shimon Whiteson
Abstract

Policy gradient methods have been widely applied in reinforcement learning. For reasons of safety and cost, learning is often conducted using a simulator. However, learning in simulation does not traditionally utilise the opportunity to improve learning by adjusting certain environment variables – state features that are randomly determined by the environment in a physical setting but controllable in a simulator. Exploiting environment variables is crucial in domains containing significant rare events (SREs), e.g., unusual wind conditions that can crash a helicopter, which are rarely observed under random sampling but have a considerable impact on expected return. We propose off-environment reinforcement learning (OFFER), which addresses such cases by simultaneously optimising the policy and a proposal distribution over environment variables. We prove that OFFER converges to a locally optimal policy and show experimentally that it learns better and faster than a policy gradient baseline.
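The role of a proposal distribution over environment variables can be illustrated with plain importance sampling. The sketch below is not the OFFER algorithm: it has no policy, and the proposal is fixed by hand rather than optimised. It assumes a hypothetical one-step setting in which a single binary environment variable (a rare event with probability 0.01) decides the return; all names and numbers are illustrative assumptions, not values from the paper.

```python
import random


def exact_expected_return(p_rare, r_rare, r_normal):
    """Expected return under the true environment distribution."""
    return p_rare * r_rare + (1.0 - p_rare) * r_normal


def per_sample_variance(p_rare, q_rare, r_rare, r_normal):
    """Analytic variance of one importance-weighted return sampled from q."""
    w_rare = p_rare / q_rare
    w_normal = (1.0 - p_rare) / (1.0 - q_rare)
    second_moment = (q_rare * (w_rare * r_rare) ** 2
                     + (1.0 - q_rare) * (w_normal * r_normal) ** 2)
    mean = exact_expected_return(p_rare, r_rare, r_normal)
    return second_moment - mean ** 2


def estimate_return(p_rare, q_rare, r_rare, r_normal, n, rng):
    """Monte Carlo estimate: sample the event from q, reweight by p / q."""
    total = 0.0
    for _ in range(n):
        if rng.random() < q_rare:
            total += (p_rare / q_rare) * r_rare
        else:
            total += ((1.0 - p_rare) / (1.0 - q_rare)) * r_normal
    return total / n


# Hypothetical numbers: the rare event is costly but seldom sampled.
P_RARE, R_RARE, R_NORMAL = 0.01, -100.0, 1.0
rng = random.Random(0)

exact = exact_expected_return(P_RARE, R_RARE, R_NORMAL)
var_on_env = per_sample_variance(P_RARE, P_RARE, R_RARE, R_NORMAL)
var_off_env = per_sample_variance(P_RARE, 0.5, R_RARE, R_NORMAL)
off_env_estimate = estimate_return(P_RARE, 0.5, R_RARE, R_NORMAL, 2000, rng)
```

Under these assumed numbers, sampling the rare event half of the time and reweighting keeps the estimate unbiased while giving a much smaller per-sample variance than sampling from the true distribution. This is the kind of gain a well-chosen proposal over environment variables can offer.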

Similar resources

Off-Environment RL with Rare Events

Policy gradient methods have been widely applied in reinforcement learning. For reasons of safety and cost, learning is often conducted using a simulator. However, learning in simulation does not traditionally utilise the opportunity to improve learning by adjusting certain environment variables – state features that are randomly determined by the environment in a physical setting but controlla...

Development of Reinforcement Learning Algorithm to Study the Capacity Withholding in Electricity Energy Markets

This paper addresses the possibility of capacity withholding by energy producers, who seek to increase the market price and their own profits. The energy market is simulated as an iterative game, where each state game corresponds to an hourly energy auction with a uniform pricing mechanism. The producers are modeled as agents that interact with their environment through reinforcement learning (RL...

Multicast Routing in Wireless Sensor Networks: A Distributed Reinforcement Learning Approach

Wireless Sensor Networks (WSNs) consist of independent distributed sensors with storage, processing, sensing and communication capabilities to monitor physical or environmental conditions. WSNs face a number of challenges because of limited battery power, communication, computation and storage space. In recent years, computational intelligence approaches such as evolutionar...

Reinforcement learning for energy conservation and comfort in buildings

This paper deals with the issue of achieving comfort in buildings with minimal energy consumption. Specifically, a reinforcement learning controller is developed and simulated using the Matlab/Simulink environment. The reinforcement learning signal used is a function of the thermal comfort of the building occupants, the indoor air quality and the energy consumption. This controller is then compa...

Locomotion Planning with 3D Character Animations by Combining Reinforcement Learning Based and Fuzzy Motion Planners

Motion and locomotion planning are widely used in different fields. Locomotion planning with premade character animations has attracted considerable attention in recent years. Reinforcement Learning presents promising ways to create motion planners using premade character animations. Although RL-based motion planners offer great ways to control character animations, they have some problems that...


Publication date: 2017